
303 research outputs found

Functional Distributional Semantics

Author: Copestake A
Emerson G
Publication venue: Proceedings of the 1st Workshop on Representation Learning for NLP
Publication date: 01/01/2016

Vector space models have become popular in distributional semantics, despite the challenges they face in capturing various semantic phenomena. We propose a novel probabilistic framework which draws on both formal semantics and recent advances in machine learning. In particular, we separate predicates from the entities they refer to, allowing us to perform Bayesian inference based on logical forms. We describe an implementation of this framework using a combination of Restricted Boltzmann Machines and feedforward neural networks. Finally, we demonstrate the feasibility of this approach by training it on a parsed corpus and evaluating it on established similarity datasets.

arXiv.org e-Print Archive

Crossref

Apollo (Cambridge)

Leveraging a semantically annotated corpus to disambiguate prepositional phrase attachment

Author: Copestake A
Emerson G
Publication venue: IWCS 2015 - Proceedings of the 11th International Conference on Computational Semantics
Publication date: 01/01/2015

Accurate parse ranking requires semantic information, since a sentence may have many candidate parses involving common syntactic constructions. In this paper, we propose a probabilistic framework for incorporating distributional semantic information into a maximum entropy parser. Furthermore, to better deal with sparse data, we use a modified version of Latent Dirichlet Allocation to smooth the probability estimates. This LDA model generates pairs of lemmas, representing the two arguments of a semantic relation, and can be trained, in an unsupervised manner, on a corpus annotated with semantic dependencies. To evaluate our framework in isolation from the rest of a parser, we consider the special case of prepositional phrase attachment ambiguity. The results show that our semantically-motivated feature is effective in this case, and moreover, the LDA smoothing both produces semantically interpretable topics, and also improves performance over raw co-occurrence frequencies, demonstrating that it can successfully generalise patterns in the training data. This is the final version of the article. It first appeared from Association for Computational Linguistics via http://www.aclweb.org/anthology/W15-0101

Apollo (Cambridge)

Words are vectors, dependencies are matrices: Learning word embeddings from dependency graphs

Author: Copestake A
Czarnowska P
Emerson G
Publication venue: IWCS 2019 - Proceedings of the 13th International Conference on Computational Semantics - Long Papers
Publication date: 01/01/2019

Distributional Semantic Models (DSMs) construct vector representations of word meanings based on their contexts. Typically, the contexts of a word are defined as its closest neighbours, but they can also be retrieved from its syntactic dependency relations. In this work, we propose a new dependency-based DSM. The novelty of our model lies in associating an independent meaning representation, a matrix, with each dependency-label. This allows it to capture specifics of the relations between words and contexts, leading to good performance on both intrinsic and extrinsic evaluation tasks. In addition, our model has an inherent ability to represent dependency chains as products of matrices, which provides a straightforward way of handling further contexts of a word.

Apollo (Cambridge)

Hierarchical statistical semantic realization for minimal recursion semantics

Author: Byrne W
Copestake A
Horvat M
Publication venue: IWCS 2015 - Proceedings of the 11th International Conference on Computational Semantics
Publication date: 01/01/2015

Apollo (Cambridge)

Ideal Words: A Vector-Based Formalisation of Semantic Competence

Author: Copestake A
Herbelot A
Publication venue: KI - Künstliche Intelligenz
Publication date: 22/11/2021

Funder: Università degli Studi di Trento

In this theoretical paper, we consider the notion of semantic competence and its relation to general language understanding, one of the most sought-after goals of Artificial Intelligence. We come back to three main accounts of competence involving (a) lexical knowledge; (b) truth-theoretic reference; and (c) causal chains in language use. We argue that all three are needed to reach a notion of meaning in artificial agents and suggest that they can be combined in a single formalisation, where competence develops from exposure to observable performance data. We introduce a theoretical framework which translates set theory into vector-space semantics by applying distributional techniques to a corpus of utterances associated with truth values. The resulting meaning space naturally satisfies the requirements of a causal theory of competence, but it can also be regarded as some ‘ideal’ model of the world, allowing for extensions and standard lexical relations to be retrieved.

Apollo (Cambridge)

Discontinuous Galerkin spatial discretisation of the neutron transport equation with pyramid finite elements and a discrete ordinate (SN) angular approximation

Author: Badalassi V
Copestake A
Eaton MD
Kophazi J
O'Malley B
Warner P
Publication venue: Elsevier BV
Publication date: 02/11/2017

In finite element analysis, it is well known that hexahedral elements are the preferred type of three-dimensional element because of their accuracy and convergence properties. However, in general it is not possible to mesh complex geometry problems using purely hexahedral meshes. Indeed, for highly complex geometries, a mixture of hexahedra and tetrahedra is often required. However, in order to geometrically link hexahedra and tetrahedra in a conforming finite element mesh, pyramid elements will be required. Until recently, the basis functions of pyramid elements were not fully understood from a mathematical or computational perspective. Indeed, only first-order pyramid basis functions were rigorously derived and used within the field of finite elements. This paper makes use of a method developed by Bergot that enables the generation of second and higher-order basis functions, applying them to finite element discretisations of the neutron transport equation in order to solve nuclear reactor physics, radiation shielding and nuclear criticality problems. The results demonstrate that the pyramid elements perform well in almost all cases in terms of both solution accuracy and convergence properties.

Spiral - Imperial College Digital Repository

Detecting modification of biomedical events using a deep parsing approach

Author: A Copestake
A Frank
A MacKinlay
Andrew MacKinlay
B Medlock
C Pollard
D Flickinger
David Martinez
E Briscoe
E Buyko
E Velldal
G Móra
H Kilicoglu
H Uszkoreit
I Solt
J Björne
J Hakenberg
JD Kim
KB Cohen
P Adolphs
R Farkas
S Van Landeghem
Timothy Baldwin
U Callmeier
V Vincze
WW Chapman
Y Tsuruoka
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: This work describes a system for identifying event mentions in bio-molecular research abstracts that are either speculative (e.g. "analysis of IkappaBalpha phosphorylation", where it is not specified whether phosphorylation did or did not occur) or negated (e.g. "inhibition of IkappaBalpha phosphorylation", where phosphorylation did not occur). The data comes from a standard dataset created for the BioNLP 2009 Shared Task. The system uses a machine-learning approach, where the features used for classification are a combination of shallow features derived from the words of the sentences and more complex features based on the semantic outputs produced by a deep parser. Method: To detect event modification, we use a Maximum Entropy learner with features extracted from the data relative to the trigger words of the events. The shallow features are bag-of-words features based on a small sliding context window of 3-4 tokens on either side of the trigger word. The deep parser features are derived from parses produced by the English Resource Grammar and the RASP parser. The outputs of these parsers are converted into the Minimal Recursion Semantics formalism, and from this, we extract features motivated by linguistics and the data itself. All of these features are combined to create training or test data for the machine learning algorithm. Results: Over the test data, our methods produce approximately a 4% absolute increase in F-score for detection of event modification compared to a baseline based only on the shallow bag-of-words features. Conclusions: Our results indicate that grammar-based techniques can enhance the accuracy of methods for detecting event modification.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Melbourne Institutional Repository

Extremophiles in an Antarctic Marine Ecosystem

Author: Dickinson Iain
Goodall-Copestake William
Pearce David A.
Schlitt Thomas
Thorne Michael A.S.
Ávila-Jiménez Maria L.
Publication venue: MDPI AG
Publication date: 01/01/2016

Recent attempts to explore marine microbial diversity and the global marine microbiome have indicated a large proportion of previously unknown diversity. However, sequencing alone does not tell the whole story, as it relies heavily upon information that is already contained within sequence databases. In addition, microorganisms have been shown to present small-to-large scale biogeographical patterns worldwide, potentially making regional combinations of selection pressures unique. Here, we focus on the extremophile community in the boundary region located between the Polar Front and the Southern Antarctic Circumpolar Current in the Southern Ocean, to explore the potential of metagenomic approaches as a tool for bioprospecting in the search for novel functional activity based on targeted sampling efforts. We assessed the microbial composition and diversity from a region north of the current limit for winter sea ice, north of the Southern Antarctic Circumpolar Front (SACCF) but south of the Polar Front. Although most of the more frequently encountered sequences were derived from common marine microorganisms, within these dominant groups we found a proportion of genes related to secondary metabolism of potential interest in bioprospecting. Extremophiles were rare by comparison but belonged to a range of genera. Hence, they represented interesting targets from which to identify rare or novel functions. Ultimately, future shifts in environmental conditions favoring more cosmopolitan groups could have an unpredictable effect on microbial diversity and function in the Southern Ocean, perhaps excluding the rarer extremophiles.

Multidisciplinary Digital Publishing Institute

Northumbria Research Link

Directory of Open Access Journals

PubMed Central

NERC Open Research Archive

Cascaded classifiers for confidence-based chemical named entity recognition

Author: A Copestake
A McCallum
A Vasserman
Ann Copestake
B Alex
B Carpenter
C Batchelor
D Roth
H Ji
J Lafferty
J Townsend
JR Finkel
K Degtyarenko
K Yoshida
M Ashburner
P Corbett
Peter Corbett
U Reyle
V Krishnan
WJ Wilbur
Publication venue: BioMed Central
Publication date: 01/01/2008

Crossref

Springer - Publisher Connector

PubMed Central

Impact assessment of micro-enterprise projects

Author: Chitere P. O.
Copestake J. G.
Johnson S.
McGregor J. A.
Njeru E.
Njoka J.
Ongile G.
Otunga R.
Publication venue: Institute for Development Studies, University of Nairobi
Publication date: 01/01/1999

IDS OpenDocs